[Refactor] Refine AST to preserve JOIN and ORDER BY syntax + Unary operator support#101
Merged
colinthebomb1 merged 5 commits intomainfrom Mar 20, 2026
Merged
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR refines the SQL AST so the parser/formatter can preserve syntactic distinctions needed for faithful SQL round-tripping (bare JOIN vs INNER JOIN, implicit vs explicit ORDER BY ... ASC) and introduces a dedicated unary-operator node.
Changes:
- Added
JoinType.JOINand updated parser/formatter to distinguish bareJOINfromINNER JOIN. - Changed
OrderByItemNode.sortdefault toNoneand updated parser/formatter to only emit sort direction when explicitly present. - Added
UnaryOperatorNodeand updated parser/expected ASTs to represent unary operators (e.g.,NOT, unary-) distinctly.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_query_parser.py | Adjusts parser test expectations for updated AST semantics (but currently disables a key assertion). |
| tests/test_ast.py | Updates AST unit tests to use UnaryOperatorNode. |
| data/asts.py | Updates expected ASTs (notably queries 37, 42, 44) to match new JOIN/ORDER BY/unary semantics. |
| core/query_parser.py | Updates JOIN type parsing, ORDER BY sort parsing, and unary operator parsing into UnaryOperatorNode. |
| core/query_formatter.py | Updates join-type emission, ORDER BY emission rules, and unary operator formatting. |
| core/ast/node.py | Tightens OperatorNode construction and introduces UnaryOperatorNode; makes ORDER BY sort optional. |
| core/ast/enums.py | Adds JoinType.JOIN. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Collaborator
Author
|
Unary operator LGTM! |
baiqiushi
approved these changes
Mar 20, 2026
Contributor
baiqiushi
left a comment
There was a problem hiding this comment.
Thanks @colinthebomb1 !
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Overview
This PR refines the AST to preserve syntactic distinctions that
mo_sql_parsingtreats as semantically different, ensuring faithful SQL roundtripping through the parser and formatter.Code Changes
Added
JoinType.JOINenum value to distinguish bareJOINfrom explicitINNER JOIN(previously both mapped toJoinType.INNER)Changed
OrderByItemNodedefault_sortfromSortOrder.ASCtoNoneto distinguish implicit ordering from explicitORDER BY x ASCUpdated parser to return
JoinType.JOINfor bareJOINandNonesort for ORDER BY items without an explicit directionUpdated formatter join type map to include
JoinType.JOIN -> 'join'andJoinType.INNER -> 'inner join', and simplified ORDER BY logic to only emit sort when explicitly specifiedUpdated expected ASTs in
data/asts.pyfor queries 37 and 44 to match the new semanticsAdded
UnaryOperatorNodeas a dedicated node type for unary operators (e.g.NOT, unary-), and tightenedOperatorNodeso it always has a non-None left operand.